This is in response to the appeal brief filed 11/02/2009 appealing from the Office action 
mailed 06/08/2009. 

(1) Real Party in Interest 

A statement identifying by name the real party in interest is contained in the brief. 

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, or judicial 
proceedings which will directly affect or be directly affected by or have a bearing on the 
Board's decision in the pending appeal. 

(3) Status of Claims 

The statement of the status of claims contained in the brief is correct. 

(4) Status of Amendments After Final 

The appellant's statement of the status of amendments after final rejection 
contained in the brief is correct. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is correct. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be reviewed on appeal is 
correct. 

(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the brief is correct. 

(8) Evidence Relied Upon 

5,920,834 Sih et al. 7-1999 

6,865,162 Clemm 3-2005 

7,346,005 Dowdal 3-2008 

2004/0073692 Gentle et al. 4-2004 

(9) Grounds of Rejection 

The following ground(s) of rejection are applicable to the appealed claims: 



Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed 
or described as set forth in section 102 of this title, if the differences between the 
subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made 
to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was 
made. 



5. Claims 1, 5, 7, 9, 13, 14, and 19 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Gentle et al. (US 2004/0073692) in view of Dowdal (US 7,346,005). 
As to claims 1 and 14, Gentle teaches a method, comprising: 

receiving a plurality of packets (see [0036], VAD monitors packet 
structures in incoming digital voice stream) with audio information (see Abstract, 
audio stream, also see [0036], voice) (e.g. Applicant defines audio information to 
include voice and silence (see page 4, [0006], lines 3-5). Audio packets are 
retrieved.); 

determining by a voice activity detector (see [0036], VAD 220) whether 
said audio information represents voice information (see [0036], VAD determines 
if voice activity is present) (e.g. The determination of the audio information is 
found by the voice activity detector 220.); and 

buffering said audio information in a jitter buffer (see Figure 3, buffer 
manager 330) during said determination (see [0051], [0052], [0042], [0061], 
where the packets are received by the VAD based on the incoming stream, and 
the packets and packet structure are sent to agent 232, where a packet is 
received one at a time and the buffer receives the packets to be buffered upon 
receipt) (e.g. It is obvious to one of ordinary skill in the art that, as the VAD 
outputs the packets (see Figure 2, output of 220) and sends them to the second 
device (Figure 3, input into buffer 302), when new audio data is received by 
the first device, processing by the VAD will occur while the previous packets 
are being buffered. Support for this in Gentle is seen in [0052], where the VAD 
monitors for new data.) (e.g. The sending of the jitter delay from the adaptive 
playout unit further supports the determining (VAD decision) and jitter buffering 
being done concurrently.). The reference also teaches the use of a computer 
entailing a computer readable medium for the above limitations (see [0030]) 
(e.g. Audio information is buffered.). 

wherein said determining comprises: 

receiving frames of audio information at a voice activity detector (see 
[0036], packet structures are received); 

measuring at least one characteristic (see [0051], [0036], and [0004], 
where the reference discloses a conventional technique and shows an alternative 
based on a silence threshold to determine voice activity and energy level 
measurements) of said frames (see [0036], packet structure); 

determining a start of voice information based on said measurements (see 
[0052], VAD 220 determines silence or nonsilence as well as beginning and 
endpoints); and 

determining an end to said voice information based on said measurements (see 
[0052], VAD 220 determines silence or nonsilence as well as beginning and 
endpoints) and a delay interval (see [0051], timing measurement module used to 
determine jitter by VAD 220); and 

sending the adjusted packets to a voice codec (see [0075], where the 
adaptive playout unit is placed before the codec and in [0043], [0045], the jitter 
timing module has an adaptive control of FIFO delay) (e.g. It would have been 
obvious to one of ordinary skill in the art to have used this embodiment in 
order for the playout unit to receive encoded packets rather than decoded 
packets, where additional delay is present due to processing.). 

adjusting of the delay interval (see [0043], timing measurement module 
allows adaptive control of FIFO delay) 

However, Gentle does not specifically teach the measuring, adding, and 
adjusting of the delay interval based on an average packet delay time. 

Dowdal teaches: 

measuring an average packet delay time by said jitter buffer (see Dowdal, 
col. 4, lines 33-60, where the delays between packets are calculated and a 
running average is maintained in order to reset the value of the FIFO buffer for 
playout); 

adding said average packet delay time to each of the plurality of packets 
prior to sending the plurality of packets (see col. 2, lines 42-44 and col. 3, lines 
34-41, where the playout delay is adjusted based on the calculated delay value. 
It is implied in Dowdal that the calculated delay is added (either a negative or 
positive value is applied)); and 

adjusting said delay interval to correspond to an average packet delay 
time (see col. 4, lines 33-60, where the delays between packets are calculated 
and a running average is maintained in order to reset the value of the FIFO 
buffer for playout). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the voice based packet network as 
taught by Gentle with the use of a delay based on the average packet delay time 
as taught by Dowdal. The motivation to have combined the two references 
involves the improvement in audio quality for effective playout of audio by 
minimizing jitter and delay (see Dowdal, col. 1, lines 15-21). 



As to claim 5, Gentle in view of Dowdal teaches all of the limitations as in claim 
1, above. 

Furthermore, Gentle teaches said characteristic comprises an estimate of 
an energy level for said frame (see [0051], energy level measurement can be 
employed by VAD 220) (e.g. An energy level is used to determine if speech is 
present.). 



As to claim 9, Gentle teaches a system comprising: 

an antenna (see [0030], radio, telephone, wired analog, etc.) (e.g. It is 
inherent that digital phones consist of a built-in antenna as well as a receiver for 
hearing audio information and a transmitter for transmitting information.); 

a receiver connected to said antenna (see [0030], radio, telephone, wired 
analog, etc. and see Figure 3, 228, receives information from first user and 
[0040]) to receive a frame of information (e.g. The receiver receives the packets 
of information from the first user); 

a voice activity detector (see [0036], VAD determines if voice activity is 
present) to detect voice information in said frame (see [0036], VAD monitors 
packet structures in incoming digital voice stream) (e.g. The determination of the 
audio information is found by the voice activity detector 220.); and 

a jitter buffer (see Figure 3, buffer manager 330) to buffer said information 
during said detection by said voice activity detector (see [0051], [0052], [0042], 
[0061], where the packets are received by the VAD based on the incoming stream, 
and the packets and packet structure are sent to agent 232, where a packet is 
received one at a time and the buffer receives the packets to be buffered upon 
receipt) (e.g. It is obvious to one of ordinary skill in the art that, as the VAD 
outputs the packets (see Figure 2, output of 220) and sends them to the second 
device (Figure 3, input into buffer 302), when new audio data is received by 
the first device, processing by the VAD will occur while the previous packets 
are being buffered. Support for this in Gentle is seen in [0052], where the VAD 
monitors for new data.) (e.g. The sending of the jitter delay in [0051] from the 
adaptive playout unit further supports the determining (VAD decision) and jitter 
buffering being done concurrently.), sending the adjusted packets to a voice 
codec (see [0075], where the adaptive playout unit is placed before the codec 
and in [0043], [0045], the jitter timing module has an adaptive control of FIFO 
delay) (e.g. It would have been obvious to one of ordinary skill in the art to 
have used this embodiment in order for the playout unit to receive encoded 
packets rather than decoded packets, where additional delay is present due to 
processing). 

wherein said voice activity detector receives frames of audio information, 
measures at least one characteristic of said frames (see [0051], [0036], and 
[0004], where the reference discloses a conventional technique and shows an 
alternative based on a silence threshold to determine voice activity and energy 
level measurements, and see [0036], packet structure), determines a start of 
voice information based on said measurements (see [0052], VAD 220 
determines silence or nonsilence as well as beginning and endpoints), 
determines an end to said voice information based on said measurements (see 
[0052], VAD 220 determines silence or nonsilence as well as beginning and 
endpoints) and a delay interval (see [0051], timing measurement module used to 
determine jitter by VAD 220), adjusting of the delay interval (see [0043], timing 
measurement module allows adaptive control of FIFO delay). 

However, Gentle does not specifically teach the measuring, adding, and 
adjusting of the delay interval based on an average packet delay time. 

Dowdal teaches: 

measuring an average packet delay time by said jitter buffer (see Dowdal, 
col. 4, lines 33-60, where the delays between packets are calculated and a 
running average is maintained in order to reset the value of the FIFO buffer for 
playout); 

adding said average packet delay time to each of the plurality of packets 
prior to sending the plurality of packets (see col. 2, lines 42-44 and col. 3, lines 
34-41, where the playout delay is adjusted based on the calculated delay value. 
It is implied in Dowdal that the calculated delay is added (either a negative or 
positive value is applied)); and 

adjusting said delay interval to correspond to an average packet delay 
time (see col. 4, lines 33-60, where the delays between packets are calculated 
and a running average is maintained in order to reset the value of the FIFO 
buffer for playout). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the voice based packet network as 
taught by Gentle with the use of a delay based on the average packet delay time 
as taught by Dowdal. The motivation to have combined the two references 
involves the improvement in audio quality for effective playout of audio by 
minimizing jitter and delay (see Dowdal, col. 1, lines 15-21). 

As to claim 13, Gentle in view of Dowdal teaches all of the limitations as in claim 
9, above. 

Furthermore, Gentle teaches said voice activity detector further comprises: 

an estimator to estimate energy level values (see [0051], energy level 
measurement by VAD 220) (e.g. Energy levels are estimated.); and 

a voice classification module connected to said estimator to classify 
information for said frame (see [0051], VAD 220 classifies based on silence or 
non-silence). 

6. Claims 2, 3, 12, 15, and 16 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Gentle in view of Dowdal, as applied to claims 1, 9, and 14 above, 
in view of Clemm (US 6,865,162). 

As to claims 2 and 15, Gentle in view of Dowdal teach a voice based packet 
network. 

However, Gentle in view of Dowdal does not specifically teach the 
buffering of a portion of said audio information in a pre-buffer for a predetermined 
time interval. 

Clemm does teach the use of a buffer (see col. 2, line 31) for a 
predetermined time (see col. 2, lines 31-33) prior to said determining (see Figure 
1, elements 110 and 120 and col. 2, lines 30-37) (e.g. A pre-buffer is used.). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the voice based packet network as 
taught by Gentle in view of Dowdal with the buffer before the voice activity 
detector as taught by Clemm. The motivation to have combined the two 
references involves the elimination of clipping associated with voice activity 
detector directed during silence suppression (see Clemm, col. 2, lines 47-48). 

As to claims 3 and 16, Gentle in view of Dowdal teaches all of the limitations as 
in claims 1 and 13, above. 

Furthermore, Gentle teaches sending said information from the jitter buffer 
to an end user (see Figure 3, second user, 312) (e.g. The applicant denotes the 
endpoint to be defined as the human user (see Applicant's Specification, page 8, 
[0018], lines 5-6).) (Further, the sending of audio information to the user from the 
pre-buffer would have been apparent with the teaching presented by Clemm to 
avoid clipping). 

As to claim 12, Gentle in view of Dowdal teach all of the limitations as in claim 9. 

Furthermore, Gentle in view of Dowdal et al. teach a voice packet based 
network. 

However, Gentle in view of Dowdal do not specifically teach the buffering 
of a portion of said audio information in a pre-buffer for a predetermined time 
interval. 

Clemm teaches further comprising a buffer to store pre-threshold speech 
during detection by voice activity detector (see Figure 1, elements 110 and 120 
and col. 2, lines 30-37) (The reference buffers a pre-threshold speech based 
upon two values, from a delay.) 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the voice based packet network as 
taught by Gentle in view of Dowdal with the buffer before the voice activity 
detector as taught by Clemm. The motivation to have combined the two 
references involves the elimination of clipping associated with voice activity 
detector directed during silence suppression (see Clemm, col. 2, lines 47-48). 

7. Claims 8, 10, 11, and 20 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Gentle in view of Dowdal as applied to claim 9 above, and further in 
view of Sih et al. (US 5,920,834). 

As to claims 8 and 20, Gentle in view of Dowdal teaches all of the limitations as 
in claims 1 and 14, above. 

Furthermore, Gentle teaches retrieving a frame (see Figure 2, output of 
212) of audio information from said packets (e.g. Audio information in the form of 
voice is received, which has undergone pulse code modulation); 

canceling echo from said frame of audio information (see echo canceller 
216); and 

sending said frame of audio information to a voice activity detector (see 
Figure 6, output of echo canceller 216 to input of VAD 220). 

However, Gentle in view of Dowdal do not specifically teach the receiving 
of an echo cancellation reference signal. 

Sih does teach receiving an echo cancellation reference signal (see col. 6, 
lines 14-18 and Figure 2, echo canceller 10, where z'(n) is the reference signal). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have modified the voice based packet network as 
taught by Gentle in view of Dowdal with the use of a reference signal to cancel 
echo as taught by Sih for the purpose of noise suppression (see Sih, col. 3, lines 
5). 

As to claim 10, Gentle in view of Dowdal teach all of the limitations as in claim 9. 

Furthermore, Gentle in view of Dowdal teach a voice packet based 
network. 

However, Gentle in view of Dowdal do not specifically teach the echo 
canceller connected to a receiver to cancel the echo. 

Sih et al. does teach the echo canceller being connected to a 
receiver (see Figure 1, elements 14 and 10) (e.g. It is evident that a transceiver 
consists of a receiver and a transmitter). 

It would have been obvious to one of ordinary skill in the art at the time 
the invention was made to have the echo canceller connected to a receiver. The 
motivation to have combined the two references involves cancellation of echo for 
mobile phones that may occur in speech signals (e.g. see Sih et al., col. 23-25) 
as would have been apparent in the teachings of Gentle, which describes 
communication between telephony devices. 

As to claim 11, Gentle in view of Dowdal in view of Sih et al. teaches all of the 
limitations as in claim 9. 

Furthermore, Sih et al. teaches a transmitter (see Figure 1, element 14) 
(e.g. Transceiver consists of a transmitter) to provide an echo cancellation signal 
to said echo canceller (see Figure 1, element 10 and col. 6, lines 14-18). 



(10) Response to Argument 



Claims 1, 5, 9, 13, and 14 are Rejected under 35 U.S.C. §103 under Gentle in 
view of Dowdal 

Appellant asserts on page 24 with respect to claims 1 and 14 

Applicant respectfully submits that Gentle fails to teach or suggest all of 
the limitations contained in claim 1. Paragraph [0051] of Gentle teaches 
forwarding the results of the jitter to the VAD. However, claim 1 teaches receiving 
a plurality of packets with audio information, determining ... whether said audio 
information represents voice information and buffering said audio information 
during said determination. The audio information with the plurality of packets is 
the same audio information at each step of the limitation. Claim 1 teaches that 
the audio information is received, that the voice detector determines whether the 
audio information is voice information and during the determining, the audio 
information is buffered. Applicant submits that Gentle teaches a serial approach 
of processing and then buffering the frame of information. Applicant submits that 
claim 1 is clearly different than the teaching of Gentle. 



In response to the Appellant's argument that the audio information with the 
plurality of packets is the same audio information at each step of the limitation, the 
examiner cannot concur. It should be noted that the present claims recite claim 
language that is broad enough to include a serial approach as that of Gentle and which 
the Appellant admits Gentle is teaching. The Appellant is trying to distinguish over the 
cited prior art by claiming a parallel approach, where voice activity detection and jitter 
buffering are occurring at the same time for a specific packet. However, the present 
claims encompass both interpretations. The initial claim language reads "receiving a 
plurality of packets with audio information." Such limitation is broad enough to read on 
each packet in the plurality being associated with audio information, where the audio 
information is different for each packet but still represents audio information. The audio 
information of a specific packet is not being specifically excluded, as a plurality of 
packets contain audio information, which is different for each packet. Since each 
received packet contains audio information, the packet for which voice activity is being 
determined may be different from that which is buffered since only audio information is 
being claimed. The buffering limitation of the claim only states said audio information 
and does not exclude or restrict the buffering of a specific packet of audio information to 
be the same information for which VAD will process (i.e. the same packet with the 
specific audio information for which processing by the VAD is being done). Hence, the 
teachings of Gentle are not different from the teachings of the claim and the audio 
information with the plurality of packets does not have to be the same audio information 
at each step of the limitation. 

Application/Control Number: 10/722,038 
Art Unit: 2626 

Appellant asserts on pages 24 and 25 

Applicant respectfully submits that the Examiner has not provided any 
support in the cited references directed to "buffering said audio information in a 
jitter buffer during said determination" as recited in independent claim 1. 
Consequently, Gentle fails to disclose, teach or suggest every element recited in 
claim 1. Furthermore, Applicant submits that Dowdal fails to remedy the above 
identified deficiencies of Gentle. For at least these reasons, Applicant submits 
that claim 1 is patentable over the cited references, whether taken alone or in 
combination. 

In response to the Appellant's argument that the Examiner has not provided any 
support in the cited references directed to "buffering said audio information in a jitter 
buffer during said determination," the Examiner cannot concur. Support was provided in 
light of the interpretation for a serial approach in Gentle. The interpretation for which 
support was provided was the processing of a single packet being analyzed by the jitter 
buffer and the next incoming packet being processed by the VAD, where both packets 
contain audio information for the plurality of packets received. Support in Gentle is 
provided in paragraphs [0042], [0051], [0052], and [0061] and in Figure 2, VAD 220 and 
Figure 3, receive buffer 336. The cited sections and Figure describe this interpretation 
where each incoming packet is analyzed and transmitted to the second user device and 
a next incoming packet is monitored and analyzed by the first user device. 

Claim 14 presents similar features as in claim 1 and therefore is rejected for 
similar reasons as mentioned above. 

Appellant asserts on page 26 with respect to claim 9 

Applicant respectfully submits that Gentle fails to teach or suggest all of 
the limitations contained in claim 1. Paragraph [0051] of Gentle teaches 
forwarding the results of the jitter to the VAD. However, claim 9 teaches a 
receiver to receive a frame of information, a voice activity detector to detect voice 
information in said frame, and a jitter buffer to buffer said information during said 
detection by the voice activity detector. Claim 9 states that the same information 
is buffered during the detection. Applicant submits that Gentle teaches a serial 
approach of processing and then buffering the frame of information. Applicant 
submits that claim 9 is clearly different than the teaching of Gentle. 

In response to the Appellant's argument that Claim 9 states the same information 
is buffered during the detection, the examiner cannot concur. It should be noted that the 
present claims recite claim language that is broad enough to include a serial approach 
as that of Gentle and which the Appellant admits Gentle is teaching and discussed 
above. The Appellant is trying to distinguish over the cited prior art by claiming a parallel 
approach, where voice activity detection and jitter buffering are occurring at the same 
time for a specific frame. However, the present claims encompass both interpretations. 
The initial claim language reads "a receiver ... to receive a frame of information." The 
VAD and jitter buffer perform functions on the same frame. However, the 5th 
paragraph of the claim recites "wherein said voice activity detector receives frames of 
audio information." The latter limitation further defines the processing of information that 
is done by the VAD. The limitation of "frames of audio information" is broad enough to 
conclude that each frame contains audio information, where each frame received in the 
plurality can contain different audio information. The audio information of a specific 
frame is not being excluded as a plurality of frames contains audio information. Since 
each frame contains audio information the frame for which voice activity is being 
determined may be different from that which is buffered. The buffering limitation of the 
claim only states said audio information and does not exclude or restrict the buffering of 
a specific frame of audio information to be the same information for which VAD will 
process (i.e. the same frame with the specific audio information for which processing by 
the VAD is being done). Hence, the teachings of Gentle are not different from the 
teachings of the claim and the same information does not have to be buffered during the 
detection as the VAD receives a plurality of frames with audio information. 

Since claims 5 and 13 stand or fall with independent claims 1, 9, and 14, please 
see the arguments presented for claims 1, 9, and 14. 

Claims 2, 3, 12, 15, and 16 are Rejected under 35 U.S.C. §103 under Gentle 
in view of Dowdal in view of Clemm 

Since the claims stand or fall with independent claims 1, 9, and 14, please see the 
arguments presented for claims 1, 9, and 14. 

Claims 8, 10, 11, and 20 are Rejected under 35 U.S.C. §103 under Gentle in 
view of Dowdal in view of Sih 

Since the claims stand or fall with independent claims 1, 9, and 14, please see the 
arguments presented for claims 1, 9, and 14. 

(11) Related Proceeding(s) Appendix 

No decision rendered by a court or the Board is identified by the examiner in the 
Related Appeals and Interferences section of this examiner's answer. 

For the above reasons, it is believed that the rejections should be sustained. 



Respectfully submitted, 
Paras Shah 
01/21/2010 

Conferees: 

/David R Hudspeth/ 

Supervisory Patent Examiner, Art Unit 2626 



James Wozniak 

/James S. Wozniak/ 

Primary Examiner, Art Unit 2626 



/Paras Shah/ 
Examiner, Art Unit 2626 
01/21/2010 



